13 research outputs found

    Supervised Identification of Writer's Native Language Based on Their English Word Usage

    Get PDF
    In this paper, we investigate the possibility of constructing an automated tool that detects a writer's first language from a document written in their second language. Since English is the contemporary lingua franca, commonly used by non-native speakers, we chose it as the second language to study. We examine English texts from computer science, a field related to mathematics; more generally, we wanted to study texts from a domain that operates with formal rules. We achieved a high classification rate, about 90%, using a relatively simple model (n-grams with logistic regression). We trained the model to distinguish twelve nationality groups/first languages in our dataset. The classification mechanism was implemented using logistic regression with L1 regularisation, which performed well on the sparse document-term data table. The experiment showed that vocabulary alone suffices to detect the first language with high accuracy.
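The pipeline described above can be sketched in a few lines. This is a minimal illustration only: the texts, the toy "en"/"pl" labels, and the hyperparameter values are placeholders, not the paper's dataset or exact configuration.

```python
# Sketch of an n-gram + L1-regularised logistic regression classifier,
# in the spirit of the approach described above. All data below is toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "we prove the theorem by induction on n",
    "the informations are stored in the database",  # typical L1-transfer error
    "we consider a graph with weighted edges",
    "this allows to compute the result faster",     # typical L1-transfer error
]
labels = ["en", "pl", "en", "pl"]  # hypothetical first-language labels

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),  # word uni- and bigrams
    # liblinear supports the L1 penalty; it keeps the learned
    # coefficient vector sparse, which suits document-term data
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
)
model.fit(texts, labels)
print(model.predict(["this allows to compute the answer"]))
```

The L1 penalty drives most n-gram weights to zero, so the fitted model effectively selects a small vocabulary of discriminative word patterns.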

    Time Series Classification Using Images

    Get PDF
    This work is a contribution to the field of time series classification. We propose a novel method that transforms time series into multi-channel images, which are then classified with Convolutional Neural Networks as an off-the-shelf classifier. We present different variants of the proposed method. Time series with different characteristics are studied: univariate, multivariate, and of varying lengths. Several selected methods of time-series-to-image transformation are considered, taking into account the original series values, value changes (first differentials), and changes in value changes (second differentials). The paper presents an empirical study demonstrating the quality of time series classification using the proposed approach.
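One way to realise such a transformation is sketched below: the raw values, first differences, and second differences each become one image channel, here encoded as a recurrence-plot-style pairwise-difference matrix. The paper's actual transforms may differ; this is an illustrative assumption.

```python
import numpy as np

def series_to_image(x: np.ndarray) -> np.ndarray:
    """Turn a univariate series into a 3-channel image: one channel each
    for the raw values, first differences, and second differences.
    Each channel is a pairwise-difference matrix |v_i - v_j|
    (a recurrence-plot-style encoding, chosen here for illustration)."""
    channels = []
    for v in (x, np.diff(x), np.diff(x, n=2)):
        m = np.abs(v[:, None] - v[None, :])
        # differencing shortens the series, so pad each matrix
        # back to the raw series' length to align the channels
        pad = len(x) - len(v)
        m = np.pad(m, ((0, pad), (0, pad)))
        channels.append(m)
    return np.stack(channels, axis=-1)  # shape (n, n, 3)

img = series_to_image(np.array([0.0, 1.0, 0.5, 2.0, 1.5]))
print(img.shape)  # (5, 5, 3)
```

The resulting (n, n, 3) arrays can be fed directly to any standard image CNN.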

    Automatic Data Understanding: the Tool for Intelligent Man-Machine Communication

    No full text
    The paper focuses on man-machine communication, which is perceived in terms of data exchange. Understanding the data being exchanged is the fundamental property of intelligent communication. The main objective of this paper is to introduce the paradigm of intelligent data understanding. The paradigm stems from a syntactic and semantic characterization of data and rests on the paradigm of granular structuring of data and computation. The paper does not introduce a formal theory of intelligent data understanding. Instead, this paradigm, together with the notions of granularity, semantics, and syntax, is cast within the domain of music information. This domain immersion is necessary because the details of the automatic data understanding paradigm depend substantially on the application domain.

    Pattern recognition : a quality of data perspective

    No full text
    xii, 296 p. ; 24 cm

    Tuning of a Knowledge-Driven Harmonization Model for Tonal Music

    No full text
    The paper presents and discusses direct and indirect tuning of a knowledge-driven harmonization model for tonal music. Automatic harmonization is a data analysis problem: an algorithm processes a music notation document and generates specific metadata (harmonic functions). The proposed model can be seen as an expert system with manually selected weights, based largely on music theory. It emphasizes universality: the possibility of obtaining varied but controllable harmonies. It is directly tunable by changing the internal parameters of the harmonization mechanisms, as well as an importance weight assigned to each mechanism. The authors also propose indirect model tuning, using supervised learning with a preselected set of examples. The indirect tuning algorithms are evaluated experimentally and discussed. The proposed harmonization model is amenable to both direct (expert-based) and indirect (data-driven) modifications, which allows for mixed learning and relatively easy interpretation of the internal knowledge.
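The core idea of weighted harmonization mechanisms can be sketched as a scoring function over candidate harmonic functions. The rule names, chord symbols, and weight values below are purely illustrative, not the authors' actual mechanisms.

```python
# Hypothetical sketch of a weighted rule-based harmonization step:
# candidate harmonic functions are scored by several "mechanisms",
# each with a tunable importance weight (direct tuning = editing
# these weights; indirect tuning = fitting them to examples).

def score(candidate, prev, mechanisms, weights):
    """Weighted sum of mechanism scores for one candidate."""
    return sum(w * rule(candidate, prev)
               for rule, w in zip(mechanisms, weights))

# toy mechanisms operating on (candidate chord, previous chord) pairs
prefers_dominant_to_tonic = lambda c, prev: 1.0 if (prev, c) == ("D", "T") else 0.0
avoids_repetition         = lambda c, prev: 0.0 if c == prev else 1.0

mechanisms = [prefers_dominant_to_tonic, avoids_repetition]
weights = [2.0, 0.5]          # illustrative expert-chosen importance weights

candidates = ["T", "S", "D"]  # tonic, subdominant, dominant
prev = "D"
best = max(candidates, key=lambda c: score(c, prev, mechanisms, weights))
print(best)  # T: the dominant-to-tonic resolution scores highest
```

Indirect tuning would then amount to adjusting `weights` so that the argmax reproduces harmonizations from a preselected set of training examples.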

    INFORMATION STRUCTURING IN NATURAL LANGUAGE COMMUNICATION: SYNTACTICAL APPROACH

    No full text
    This paper introduces a new framework for processing Natural Language statements. A parallel is drawn between Natural Language processing and the Data Mining technique of information granulation. The formalism affords a consistent representation of the well-known phenomenon of 'approximate' grammatical correctness of Natural Language statements. The approach is validated on some simple Natural Language statements, and directions for the future development of the system are outlined.

    Pattern recognition: a quality of data perspective

    Get PDF
    There is no doubt that the construction (design) of a product has a great influence on the economy of its production. In the structural definition of a product, the aim is therefore a design that requires a minimum of material and processing time on a minimum of manufacturing equipment, for a product that suits its intended purpose and function. In development and product design, the function and required performance characteristics of the product are the requirements the designer primarily tries to satisfy, but the manufacturing processes that will be applied in production, as well as the cost-effectiveness of those processes, must also be kept in mind. This requires command of a broad area of production and its specifics. To achieve this goal, it is desirable and necessary that the product design be subjected to technological analysis, in order to determine and, if necessary, improve the manufacturability of the product, i.e. its suitability for production, primarily from the viewpoint of the capabilities of the company's own production facilities and of the subcontractors who will participate in its manufacture. In the practical part of this paper, the technological process for the "STEUERSCHEIBE 13" – 20" ENTLADER is designed. Based on the given drawing, the technological procedure, work order and control were prepared.

    Fuzzy Cognitive Map Reconstruction: Dynamics Versus History

    No full text
    This study is concerned with a fundamental issue of time series representation for modeling and prediction with Fuzzy Cognitive Maps. We introduce two distinct time series representation schemes for Fuzzy Cognitive Map design. The first is based on temporal relationships, namely the time series amplitude, the amplitude change, and the change of the amplitude change (the dynamics perspective). The second is based on three consecutive historical observations: the present value, the past value, and the before-past value (the history perspective, second-order relationships). The introduced procedures are experimentally verified and compared on several synthetic and real-world time series of various characteristics. The history-oriented time series representation turned out to be more advantageous. The quality of FCM-based time series models and of one-step-ahead predictions was measured in terms of Mean Squared Error. We show that models designed with the history-oriented representation generally require fewer FCM nodes to reach a quality comparable to models built on the dynamics-oriented representation. As a result, with the history-oriented time series representation scheme we are able to construct simpler and better models.
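The two representation schemes can be made concrete with a short sketch; the alignment of rows to time steps is an assumption for illustration.

```python
import numpy as np

def dynamics_repr(x: np.ndarray) -> np.ndarray:
    """Dynamics perspective: amplitude, amplitude change, and change of
    amplitude change at each time step with a full history."""
    a   = x[2:]            # amplitude
    da  = np.diff(x)[1:]   # first difference, aligned with a
    dda = np.diff(x, n=2)  # second difference
    return np.stack([a, da, dda], axis=1)

def history_repr(x: np.ndarray) -> np.ndarray:
    """History perspective: present value, past value, before-past value."""
    return np.stack([x[2:], x[1:-1], x[:-2]], axis=1)

x = np.array([1.0, 2.0, 4.0, 7.0])
print(dynamics_repr(x))  # rows: [x_t, x_t - x_{t-1}, second difference]
print(history_repr(x))   # rows: [x_t, x_{t-1}, x_{t-2}]
```

Note that the two schemes carry the same information, related by an invertible linear map; the experimental difference reported above concerns how easily an FCM exploits each form.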

    Fuzzy Cognitive Map-Driven Comprehensive Time-Series Classification

    No full text
    This article presents a comprehensive approach to time-series classification. The proposed model employs a fuzzy cognitive map (FCM) as its classification engine. Preprocessed input data feed the FCM, and the map's responses, after a postprocessing procedure, are used to compute the final classification decision. The time-series data are staged using the moving-window technique to capture the time flow in the training procedure. We use a backward error propagation algorithm to compute the required model parameters. Four model hyperparameters require tuning. Two are crucial for model construction: 1) the FCM size (number of concepts) and 2) the window size (for the moving-window technique). The other two are important for training: 1) the number of epochs and 2) the learning rate. Two distinguishing aspects of the proposed model are worth noting: 1) the separation of the classification engine from pre- and postprocessing and 2) the capture of time flow for data from the concept space. The proposed classifier joins the key advantage of the FCM model, namely its interpretability, with the superior classification performance attributed to the specially designed pre- and postprocessing stages. The article presents experiments demonstrating that the proposed model performs well against a wide range of state-of-the-art time-series classification algorithms.
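The moving-window staging named above can be sketched as follows; the stride of 1 is an assumption, as the article's exact staging details are not given here.

```python
import numpy as np

def moving_windows(x: np.ndarray, window: int) -> np.ndarray:
    """Stage a series into overlapping windows (stride 1), in the spirit
    of the moving-window technique; `window` is one of the tunable
    hyperparameters mentioned above."""
    return np.stack([x[i:i + window] for i in range(len(x) - window + 1)])

x = np.arange(6, dtype=float)  # toy series 0..5
W = moving_windows(x, window=3)
print(W.shape)  # (4, 3)
```

Each row of `W` is one training instance fed to the FCM, so consecutive rows overlap and preserve the time flow of the original series.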